Xiangliang Zhang

KAUST, Saudi Arabia

AIRGuard: Guarding Agent Actions with Runtime Authority Control

May 27, 2026, via arXiv

JobBench: Aligning Agent Work With Human Will

May 25, 2026, via arXiv

AgentTrap: Measuring Runtime Trust Failures in Third-Party Agent Skills

May 13, 2026, via arXiv

Visual Aesthetic Benchmark: Can Frontier Models Judge Beauty?

May 12, 2026, via arXiv

AutoLLMResearch: Training Research Agents for Automating LLM Experiment Configuration -- Learning from Cheap, Optimizing Expensive

May 12, 2026, via arXiv

Why Search When You Can Transfer? Amortized Agentic Workflow Design from Structural Priors

Apr 27, 2026, via arXiv

Too Correct to Learn: Reinforcement Learning on Saturated Reasoning Data

Apr 20, 2026, via arXiv

PolicyLLM: Towards Excellent Comprehension of Public Policy for Large Language Models

Apr 14, 2026, via arXiv

Guardian-as-an-Advisor: Advancing Next-Generation Guardian Models for Trustworthy LLMs

Apr 08, 2026, via arXiv

SenseMath: Do LLMs Have Number Sense? Evaluating Shortcut Use, Judgment, and Generation

Apr 02, 2026, via arXiv